Phoneme dedicated ANN improves segmental duration model

Author: Freitas Diamantino Silva
Teixeira João Paulo
Publication date: 01/01/2008

The Phoneme Dedicated Artificial Neural Network (PDANN) segmental duration model consists of a set of ANNs, each trained on a single phoneme segment so that other types of phoneme segments do not influence it. Each ANN is therefore dedicated to predicting the duration of one specific phoneme segment. Objective and subjective measurements of the performance of the PDANN model were compared with those of a typical ANN model using the same input features and database. The results indicate a slight but clear perceptual preference for the PDANN.

Biblioteca Digital do IPB

Use of phoneme dedicated artificial neural networks to predict segmental durations

Author: Freitas Diamantino Silva
Teixeira João Paulo
Publication venue: University of Patras
Publication date: 01/01/2005

The results of two alternative models for predicting segmental durations in speech synthesis, both based on Artificial Neural Networks (ANNs), are discussed. The ANN model consists of a single ANN trained to predict the segmental durations of all phonemes. The phoneme dedicated ANN model consists of a set of ANNs, each dedicated to predicting the segmental duration of a specific phoneme. Both models are compared using the same input information, extracted from one European Portuguese database. Objective and subjective measurements of the performance of both approaches are compared. A slight preference was observed for the phoneme dedicated ANN model.

Biblioteca Digital do IPB

Indoor Sound Based Localization: Research Questions and First Results

Author: Araújo Rui
Freitas Diamantino
Moutinho João
Publication venue: Springer Science and Business Media LLC
Publication date: 15/04/2013

Part 17: Telecommunications. This PhD work aims to develop an inexpensive, easily deployable and widely compatible localization system for indoor use, suitable for pre-installed public address sound systems and avoiding costly installations or significant architectural changes in spaces. Using the audible sound range allows the use of low-cost off-the-shelf equipment, which keeps the deployment cost low. The state of the art presented in this paper reveals a technological void in low-cost, reliable and precise localization systems and technologies. This need was also confirmed by the authors in a previous project (NAVMETRO®), in which no suitable technological solution was found for automatically localizing people in a public space in a reliable and precise way. Although the research is in its first steps, it already provides a thorough view of the problem while discussing some possible approaches and predicting strategies to overcome the key difficulties. Some experiments have already been conducted, validating some initial premises and demonstrating how to measure the signal's time-of-flight needed for distance calculations.

Evaluation of a neural network segmental duration model for Portuguese

Author: Freitas Diamantino Silva
Teixeira João Paulo
Publication date: 01/01/2002

This paper presents a segmental duration model that, as far as the authors know, is the first published for European Portuguese with objective and subjective evaluations. The model is aimed at TTS applications and is based on an ANN trained with a resilient back-propagation algorithm. Using a substantial amount of training data and a carefully selected set of input factors, the standard deviation of the error of the segmental duration estimates reaches 19 ms and the correlation coefficient exceeds 0.9. Several models with good objective and subjective performance have been published for other languages. The methodology used to construct the model, the importance of the factors used and the neural network are presented, together with the evaluation of the model, allowing a comparison with models for other languages.

Biblioteca Digital do IPB

Modelos morfológicos tridimensionais por IRM do tracto vocal para as principais vogais do Português Europeu

Author: Freitas Diamantino Rui
Ventura Sandra Moreira Rua
Publication venue: Associação Portuguesa de Mecânica Teórica, Aplicada e Computacional
Publication date: 17/11/2013

The understanding of speech production has been widely pursued using magnetic resonance imaging (MRI), but it is not yet complete, particularly with regard to the sounds of European Portuguese (EP). The main objective of this study was the characterization of the EP vowels. Sets of two-dimensional MRI images were collected in five distinct articulatory positions during sustained production of the sound. After extraction of the vocal tract contours, a three-dimensional reconstruction was carried out, showing that MRI provides useful morphological information, with considerable accuracy, about the position and shape of the different speech articulators.

Repositório Científico do Instituto Politécnico do Porto

A new multi-modal database for developing speech recognition systems for an assistive technology application

Author: Freitas Diamantino Silva
Moura A.
Pera V.
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2004

In this paper we report on the acquisition and content of a new database intended for developing audio-visual speech recognition systems. This database supports a speaker-dependent continuous speech recognition task, based on a small vocabulary, and was captured in European Portuguese. Along with the collected multi-modal speech materials, the respective orthographic transcription and time-alignment files are supplied. The package also includes data on stochastic language models and the generative grammar associated with the collected spoken sentences. The application addressed by this database, voice control of a basic scientific calculator, has the particularity of being designed for a person with a specific motor impairment, namely muscular dystrophy. This specificity is a notable characteristic, given the lack of data resources of this kind for developing assistive systems based on audio-visual speech recognition technology.

Crossref

Biblioteca Digital do IPB

Indirect parameter estimation of continuous-time systems using discrete time data

Author: Araújo Rui Esteves
Freitas Diamantino Silva
Leite V.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 06/02/2010

This paper addresses the problem of parameter estimation of continuous-time systems using samples of their input-output data. We propose a method based on the bilinear transformation to obtain an equivalent discrete-time model. By introducing a new polynomial pre-filter, it is possible to compute the physical parameters via inverse mapping between the discrete-time and the continuous-time models. A simulation example is given to illustrate the noise effects in the parameter estimation results. Using experimental results, we demonstrate the ability of the estimator to handle real measurement problems.

Biblioteca Digital do IPB

Classes of model structures for state and parameter identification of vector controlled induction machines

Author: Araújo Rui Esteves
Freitas Diamantino Silva
Leite V.
Publication date: 06/02/2010

The purpose of this paper is to present a synthesis of classes of model structures for joint state and parameter identification of vector controlled induction motors in real time and under normal operating conditions. Based on the classical induction motor model, a set of new classes of model structures is discussed and proposed for simultaneous estimation of rotor flux components and electrical parameters.

Biblioteca Digital do IPB

Modelling and simulation of power electronic systems using a bond graph formalism

Author: Araújo Rui Esteves
Freitas Diamantino Silva
Leite V.
Publication date: 06/02/2010

This paper deals with the modelling of power electronic systems using the bond graph formalism. The switching components are modelled using an ideal representation so that a constant topology system is obtained. The purpose of the present contribution is to discuss a technique that combines bond graph energy-flow modelling and signal-flow modelling schemes for simulation and prototyping of signal processing algorithms in power electronics systems. We discuss models for the use of fully controlled, semi-controlled and non-controlled switches in static power converters. Conceptually, a simulation environment can be examined at different abstraction or hierarchy levels. Accordingly, the approach in this paper formulates a simulation task at different levels: component level, topology level, functional description and implementation description. The paper concludes with two practical examples of simulation of power electronics systems.

Biblioteca Digital do IPB